Abstract Rewards and penalties are common practical tools that can be used to promote
cooperation in social institutions. The evolution of cooperation under reward and
punishment incentives in joint enterprises has been formalized and investigated, mostly
by using compulsory public good games. Recently, Sasaki et al. (2012, Proc Natl Acad Sci
USA 109:1165-1169) considered optional participation as well as institutional incentives
and described how the interplay between these mechanisms affects the evolution of
cooperation in public good games. Here, we present a full classification of these
evolutionary dynamics. Specifically, whenever penalties are large enough to cause the
bi-stability of both cooperation and defection in cases in which participation in the
public good game is compulsory, these penalties will ultimately result in cooperation if
participation in the public good game is optional. The global stability of coercion-based
cooperation in this optional case contrasts strikingly with the bi-stability that is
observed in the compulsory case. We also argue that optional participation is not so
effective at improving cooperation under rewards.
1 Introduction

Self-interest often leads to freeloading on the contributions of others in the dynamics
associated with common goods and joint enterprises [1,2]. As is well known, incentivization,
such as rewarding and punishing, is a popular method for curbing selfish behavior and for
motivating individuals to behave cooperatively [3-13]. The number of experimental and
theoretical studies on joint enterprises under various incentive schemes is growing [14-22].

Obviously, whether they are rewards or penalties, sufficiently large incentives can
transform freeloaders into full cooperators, and incentives with small impact have no
effect on the outcomes [22]. However, incentivizing is costly, and such heavy incentives
often incur serious costs for those who provide them, whether in a peer-to-peer or
institutional manner. Previous game-theoretic studies on the evolution of cooperation with
incentives have focused on public good games with compulsory participation and revealed
that intermediate degrees of punishment lead to two stable equilibria, full defection and
full cooperation [4,5,10,13,22,23]. In this bi-stable dynamic, establishing full cooperation
requires a sufficiently large initial fraction of cooperators, or an ex ante adjustment to
overcome the initial condition [10,23]. This situation is a coordination game [24], which is
a model of great interest for analyzing widespread coordination problems (e.g., in choosing
between distinct technical standards).

In contrast to the traditional case with compulsory participation, another approach to the
evolution of cooperation is to allow individuals to opt out of joint enterprises [25-37].
The option to opt out can relax the freeloader problem: individuals can exit a joint
venture when stuck in a state in which all freeload off one another ("economic stalemate")
and then pursue a stand-alone project; if a joint venture with mutual cooperation is more
profitable than isolation, the individuals who exited will switch to contributing to the
venture. In this situation, however, defection becomes attractive again. Thus, joint
enterprises with optional participation can give rise to a rock-paper-scissors cycle [28-31].

Recently, Sasaki et al. [22] revealed that considering optional participation as well as
institutional incentives can bring about fully cooperative outcomes for intermediate ranges
of incentives. They demonstrated that opting-out combined with rewarding is not very
effective at establishing full cooperation, but opting-out combined with punishment is very
effective at establishing cooperation. Although there is a series of existing papers on the
interplay of punishment and opting-out mechanisms [38-44], these earlier studies mainly
addressed the puzzling issue of second-order freeloading: the exploitation of the efforts of
others to uphold incentives for cooperation [2, 4, 7, 45, 46]. Sasaki et al. [22] consider
incentives controlled exclusively by a centralized authority (such as an empire or a state)
[47-50], and thus, their model is already free from the second-order freeloader problem.

Here we analytically provide a full classification of the replicator dynamics in a public
good game with institutional incentives and optional participation. We clarify when and how
cooperation can be selected over defection in a bi-stable situation associated with
institutional punishment without requiring any ability to communicate among individuals.
In particular, assuming that the penalties are large enough to cause bi-stability with both
full cooperation and full defection (no matter what the basins of attraction are) in cases
of compulsory participation, cooperation will necessarily be selected in the long term
under optional participation, regardless of the initial conditions.

2 Model 

2.1 Social dilemmas

To describe our institutional-incentive model, we start from public good games with group
size $n \ge 2$. The $n$ players in a group are given the opportunity to participate in a
public good game. We assume that participation pays a fixed entrance fee $\sigma > 0$ to the
sanctioning institution, whereas non-participation yields nothing. We denote by $m$ the
number of players who are willing to participate ($0 \le m \le n$) and assume that at least
two participants are required for the game to occur [28,39-42]. If the game does take place,
each of the $m$ participants in the group can decide whether to invest a fixed amount
$c > 0$ into a common pool, knowing that each contribution will be multiplied by $r > 1$
and then shared equally among all $m - 1$ other participants in the group. Thus,
participants have no direct gain from their own investments [6,41-43,45]. If all of the
participants invest, they obtain a net payoff $(r - 1)c > 0$. The game is a social dilemma
regardless of the value of $r$, because participants can improve their payoffs by
withholding their contribution.

Let us next assume that the total incentive stipulated by a sanctioning institution is
proportional to the group size $m$ and hence of the form $m\delta$, where $\delta > 0$ is
the (potential) per capita incentive. If rewards are employed to incentivize cooperation,
these funds will be shared among the so-called "cooperators" who contribute (see [51] for a
voluntary reward fund). Hence, each cooperator will obtain a bonus of $m\delta/n_C$, where
$n_C$ denotes the number of cooperators in the group of $m$ participants. If penalties are
employed to incentivize cooperation, "defectors" who do not contribute will analogously
have their payoffs reduced by $m\delta/n_D$, where $n_D$ denotes the number of defectors in
the group of $m$ players ($m = n_C + n_D$).

We consider an infinitely large and well-mixed population of players, from which $n$
players are randomly sampled to form a group for each game. Our analysis of the underlying
evolutionary game is based on the replicator dynamics [52] for the three corresponding
strategies of the cooperator, defector, and non-participant, with respective frequencies
$x$, $y$, and $z$. The combination of all possible values of $(x, y, z)$ with
$x, y, z \ge 0$ and $x + y + z = 1$ forms the triangular state space $\Delta$. We denote by
C, D, and N the three vertices of $\Delta$ that correspond to the three homogeneous states
in which all cooperate ($x = 1$), defect ($y = 1$), or are non-participants ($z = 1$),
respectively. For $\Delta$, the replicator dynamics are defined by

$$\dot{x} = x(P_C^s - \bar{P}^s), \quad \dot{y} = y(P_D^s - \bar{P}^s), \quad \dot{z} = z(P_N^s - \bar{P}^s), \tag{1}$$

where $\bar{P}^s$ denotes the average payoff in the entire population; $P_C^s$, $P_D^s$,
and $P_N^s$ denote the expected payoff values for cooperators, defectors, and
non-participants, respectively; and $s = o, r, p$ is used to specify one of three different
incentive schemes, namely, "without incentives," "with rewards," and "with punishment,"
respectively. Because non-participants have a payoff of 0, $P_N^s = 0$, and thus,
$\bar{P}^s = xP_C^s + yP_D^s$.
We note that if $(r - 1)c > \sigma$, the three edges of the state space $\Delta$ form a
heteroclinic cycle without incentives: N $\to$ C $\to$ D $\to$ N (Figs. 2a or 3a). Defectors
dominate cooperators because of the cost of contribution $c$, and non-participants dominate
defectors because of the cost of participation $\sigma$. Finally, cooperators dominate
non-participants because of the net benefit from the public good game with
$(r - 1)c > \sigma$. In the interior of $\Delta$, all of the trajectories originate from and
converge to N, which is a non-hyperbolic equilibrium. Hence, cooperation can emerge only in
brief bursts, sparked by random perturbations [29,41].



2.2 Pay-offs 



Here, we calculate the average payoff for the whole population and the expected payoff
values for cooperators and defectors. In a group with $m - 1$ co-participants
($m = 2, \ldots, n$), a defector or a cooperator obtains from the public good game an
average payoff of $rcx/(1 - z)$ [41]. Hence,

$$P_D^o = \left( \frac{rcx}{1 - z} - \sigma \right) (1 - z^{n-1}). \tag{2}$$



Note that $z^{n-1}$ is the probability of finding no co-players and, thus, of being reduced
to non-participation. In addition, cooperators contribute $c$ with a probability
$1 - z^{n-1}$, and thus, $P_C^o - P_D^o = -c(1 - z^{n-1})$. Hence,
$\bar{P}^o = (1 - z^{n-1})[(r - 1)cx - \sigma(1 - z)]$.
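The payoffs without incentives can be cross-checked numerically. The following Python sketch evaluates Eq. (2) at one interior state and compares the resulting population average with the closed form stated above. The function names and the sample state are illustrative assumptions, not part of the original analysis; `sigma` stands for the entrance fee.

```python
def payoffs_without_incentives(x, y, z, n, r, c, sigma):
    """Return (P_C, P_D, P_bar) for the optional public good game without incentives."""
    # Probability that at least one of the n - 1 sampled co-players participates.
    game_occurs = 1.0 - z ** (n - 1)
    # Eq. (2): expected payoff of a defector.
    p_d = (r * c * x / (1.0 - z) - sigma) * game_occurs
    # A cooperator additionally pays c whenever the game takes place.
    p_c = p_d - c * game_occurs
    # Non-participants earn 0, so only cooperators and defectors enter the average.
    p_bar = x * p_c + y * p_d
    return p_c, p_d, p_bar


def average_payoff_closed_form(x, z, n, r, c, sigma):
    """Closed form of the population average payoff without incentives."""
    return (1.0 - z ** (n - 1)) * ((r - 1.0) * c * x - sigma * (1.0 - z))
```

With the parameter values used in the figures (n = 5, r = 3, c = 1, entrance fee 0.5), both routes should agree up to rounding at any interior state.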

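We now turn to the cases with institutional incentives. First, we consider penalties.
Because cooperators never receive penalties, we have $P_C^p = P_C^o$. In a group in which
the $m - 1$ co-participants include $k$ cooperators (and thus, $m - 1 - k$ defectors),
switching from defecting to cooperating implies avoiding the penalty $m\delta/(m - k)$.
Hence,

$$\begin{aligned} P_C^p - P_D^p &= (P_C^o - P_D^o) + \sum_{m=2}^{n} \binom{n-1}{m-1} (1-z)^{m-1} z^{n-m} \left[ \sum_{k=0}^{m-1} \binom{m-1}{k} \left(\frac{x}{1-z}\right)^{k} \left(\frac{y}{1-z}\right)^{m-1-k} \frac{m\delta}{m-k} \right] \\ &= (\delta - c)(1 - z^{n-1}) + \delta \frac{x}{y} \left( 1 - (1-y)^{n-1} \right), \end{aligned} \tag{3}$$

and thus,

$$\begin{aligned} \bar{P}^p &= \bar{P}^o - \delta \left[ y(1 - z^{n-1}) + x \left( 1 - (1-y)^{n-1} \right) \right] \\ &= (1 - z^{n-1}) \left( (r-1)cx - \sigma(1-z) - \delta y \right) - \delta x \left( 1 - (1-y)^{n-1} \right). \end{aligned} \tag{4}$$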
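As a sanity check on the penalty term, the sketch below sums the expected fine of a focal defector over all group compositions and compares it with the closed form $\delta[(1 - z^{n-1}) + (x/y)(1 - (1-y)^{n-1})]$, which, multiplied by $y$, gives the incentive term in Eq. (4). The function names are hypothetical, and the closed form is the one implied by Eq. (4) rather than a formula quoted from the source.

```python
from math import comb


def expected_penalty(x, y, z, n, delta):
    """Expected fine of a focal defector, summed over group compositions."""
    f = x / (1.0 - z)  # share of cooperators among participants
    g = y / (1.0 - z)  # share of defectors among participants
    total = 0.0
    for m in range(2, n + 1):  # m participants, including the focal defector
        p_group = comb(n - 1, m - 1) * (1.0 - z) ** (m - 1) * z ** (n - m)
        for k in range(m):  # k cooperators among the m - 1 co-participants
            p_k = comb(m - 1, k) * f ** k * g ** (m - 1 - k)
            # The total fine m * delta is split among the m - k defectors.
            total += p_group * p_k * m * delta / (m - k)
    return total


def expected_penalty_closed_form(x, y, z, n, delta):
    """Closed form implied by Eq. (4) after dividing the penalty term by y."""
    return delta * ((1.0 - z ** (n - 1)) + (x / y) * (1.0 - (1.0 - y) ** (n - 1)))
```

The two functions should agree up to floating-point rounding for any interior state with $x + y + z = 1$.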



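Next, we consider rewards. It is now the defectors who are unaffected, implying
$P_D^r = P_D^o$. In a group with $m - 1$ co-participants, including $k$ cooperators,
switching from defecting to cooperating implies obtaining the reward $m\delta/(k + 1)$.
Hence,

$$\begin{aligned} P_C^r - P_D^r &= (P_C^o - P_D^o) + \sum_{m=2}^{n} \binom{n-1}{m-1} (1-z)^{m-1} z^{n-m} \left[ \sum_{k=0}^{m-1} \binom{m-1}{k} \left(\frac{x}{1-z}\right)^{k} \left(\frac{y}{1-z}\right)^{m-1-k} \frac{m\delta}{k+1} \right] \\ &= (\delta - c)(1 - z^{n-1}) + \delta \frac{y}{x} \left( 1 - (1-x)^{n-1} \right), \end{aligned} \tag{5}$$

and thus,

$$\begin{aligned} \bar{P}^r &= \bar{P}^o + \delta \left[ x(1 - z^{n-1}) + y \left( 1 - (1-x)^{n-1} \right) \right] \\ &= (1 - z^{n-1}) \left( (r-1)cx - \sigma(1-z) + \delta x \right) + \delta y \left( 1 - (1-x)^{n-1} \right). \end{aligned} \tag{6}$$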
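The reward term can be cross-checked in the same way. The sketch below sums the expected bonus of a focal cooperator over group compositions and compares $x$ times that bonus with the mean reward paid per member of the population. The function names are illustrative assumptions. Because rewards only add to payoffs, the per capita outlay must be non-negative, and for compulsory participation ($z = 0$) it should reduce to $\delta(1 - (1-x)^n)$, that is, $\delta$ times the probability that a group contains at least one cooperator.

```python
from math import comb


def expected_reward(x, y, z, n, delta):
    """Expected bonus of a focal cooperator, summed over group compositions."""
    f = x / (1.0 - z)  # share of cooperators among participants
    g = y / (1.0 - z)  # share of defectors among participants
    total = 0.0
    for m in range(2, n + 1):  # m participants, including the focal cooperator
        p_group = comb(n - 1, m - 1) * (1.0 - z) ** (m - 1) * z ** (n - m)
        for k in range(m):  # k cooperators among the m - 1 co-participants
            p_k = comb(m - 1, k) * f ** k * g ** (m - 1 - k)
            # The total reward m * delta is split among the k + 1 cooperators.
            total += p_group * p_k * m * delta / (k + 1)
    return total


def reward_outlay_per_capita(x, y, z, n, delta):
    """Closed form of x * (expected bonus): mean reward paid per population member."""
    return delta * (x * (1.0 - z ** (n - 1)) + y * (1.0 - (1.0 - x) ** (n - 1)))
```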



3 Results 



3.1 Coordination and coexistence 



We investigated the interplay of institutional incentives and optional participation. As a
first step, we considered the replicator dynamics along the three edges of the state space
$\Delta$. On the DN-edge ($x = 0$), this dynamic is always D $\to$ N because the payoff for
non-participating is better than that for defecting by at least the participation fee
$\sigma$, regardless of whether penalties or rewards are in place. On the NC-edge ($y = 0$),
it is obvious that if the public good game is too expensive (i.e., if $\sigma > (r - 1)c$
under penalties or $\sigma > (r - 1)c + \delta$ under rewards), players will opt for
non-participation rather than cooperation. Indeed, N becomes a global attractor because
$\dot{z} > 0$ holds in $\Delta \setminus \{z = 0\}$. We do not consider these cases further
but assume that the dynamic of the NC-edge is always N $\to$ C.

On the CD-edge ($z = 0$), the dynamic corresponds to compulsory participation, and Eq. (1)
reduces to $\dot{x} = x(1 - x)(P_C^s - P_D^s)$. Clearly, both of the ends C ($x = 1$) and
D ($x = 0$) are fixed points. Under penalties, the term for the payoff difference is

$$P_C^p - P_D^p = -c + \delta \frac{1 - x^n}{1 - x} = -c + \delta \sum_{i=0}^{n-1} x^i. \tag{7}$$

Under rewards, it is

$$P_C^r - P_D^r = -c + \delta \frac{1 - (1 - x)^n}{x} = -c + \delta \sum_{i=0}^{n-1} (1 - x)^i. \tag{8}$$

Because $\delta > 0$, $P_C^r - P_D^r$ strictly decreases, and $P_C^p - P_D^p$ strictly
increases, with $x$. The condition under which there exists an interior equilibrium R on
the CD-edge is

$$\delta_- < \delta < \delta_+, \quad \text{with } \delta_- = \frac{c}{n} \text{ and } \delta_+ = c. \tag{9}$$
Fig. 1 Compulsory public good games with institutional incentives. The location of stable
and unstable equilibria (thick continuous lines and dashed lines, respectively) and the
direction of evolution (dotted arrows) vary, depending on the per capita incentive,
$\delta$. For very small and sufficiently large values of $\delta$, full defection
($x = 0$) and full cooperation ($x = 1$) are the final outcomes, respectively. This applies
to both incentives considered. Intermediate values of $\delta$ impact evolutionary dynamics
in a strikingly different way, as follows. a Punishment. When $\delta$ increases beyond a
threshold $\delta_-$, an unstable interior equilibrium enters the state space at $x = 1$,
moves left, and eventually exits it at $x = 0$ for $\delta = \delta_+$. b Rewards. When
$\delta$ increases beyond a threshold $\delta_-$, a (globally) stable interior equilibrium
enters the state space at $x = 0$, moves right, and eventually exits it at $x = 1$ for
$\delta = \delta_+$. Consequently, for the interval $\delta_- < \delta < \delta_+$
(gray-colored region), punishment results in bi-stability of both the pure states; rewards
lead to a stable mixture independent of the initial state. Parameters: $n = 5$, $r = 3$,
$c = 1$, and $\sigma = 0.5$.



Next, we summarize the game dynamics for compulsory public good games (Fig. 1). For
$\delta$ so small that $\delta < \delta_-$, defection is the unique outcome; D is globally
stable, and C is unstable. For $\delta$ so large that $\delta > \delta_+$, cooperation is
the unique outcome; C is globally stable, and D is unstable. For intermediate values of
$\delta$, cooperation evolves in different ways under penalties versus rewards, as follows.
Under penalties (Fig. 1a), as $\delta$ crosses the threshold $\delta_-$, C becomes stable,
and an unstable interior equilibrium R splits off from C. The point R separates the basins
of attraction of C and D. Penalties cause bi-stable competition between cooperators and
defectors, which is typical of a coordination game [24]; one or the other norm will become
established, but there can be no coexistence. With increasing $\delta$, the basin of
attraction of D becomes increasingly smaller, until $\delta$ attains the value of
$\delta_+$. Here, R merges with the formerly stable D, which becomes unstable.

In contrast, under rewards (Fig. 1b), as $\delta$ crosses the threshold $\delta_-$, D
becomes unstable, and a stable interior equilibrium R splits off from D. The point R is a
global attractor. Rewards give rise to the stable coexistence of cooperators and defectors,
which is a typical result in a snowdrift game [53]. As $\delta$ increases, the fraction of
cooperators within the stable coexistence becomes increasingly larger. Finally, as $\delta$
reaches the other threshold $\delta_+$, R merges with the formerly unstable C, which becomes
stable. We note that $\delta_+$ and $\delta_-$ have the same values, regardless of whether
we take into account rewards or penalties.
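To make the compulsory case concrete, the following Python sketch locates the interior equilibrium R on the CD-edge by bisection on the payoff differences in Eqs. (7) and (8). The function names and the `scheme` flag (`"p"` for penalties, `"r"` for rewards) are illustrative assumptions; the bisection relies only on the monotonicity in $x$ noted above.

```python
def payoff_gap_on_cd_edge(x, n, c, delta, scheme):
    """P_C - P_D under compulsory participation (z = 0), Eqs. (7) and (8)."""
    base = x if scheme == "p" else 1.0 - x
    return -c + delta * sum(base ** i for i in range(n))


def interior_equilibrium(n, c, delta, scheme, tol=1e-12):
    """Return x_R on the CD-edge, or None if delta lies outside (c/n, c)."""
    if not (c / n < delta < c):
        return None
    lo, hi = 0.0, 1.0
    gap_lo = payoff_gap_on_cd_edge(lo, n, c, delta, scheme)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gap_mid = payoff_gap_on_cd_edge(mid, n, c, delta, scheme)
        # Keep the half-interval on which the payoff gap changes sign.
        if (gap_mid > 0) == (gap_lo > 0):
            lo, gap_lo = mid, gap_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a given $\delta$, the two schemes yield mirror-image positions, $x_R$ under rewards being $1 - x_R$ under penalties, because Eq. (8) is Eq. (7) with $x$ replaced by $1 - x$; the stability of R along the edge is opposite in the two cases.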
3.2 Uniqueness of the interior equilibrium Q

Now, we consider the interior of the state space $\Delta$. We start by proving that, for
$n > 2$, if an equilibrium Q exists in the interior, it is unique. For this purpose, we
introduce the coordinate system $(f, z)$ in $\Delta \setminus \{z = 1\}$, with
$f = x/(x + y)$, and we rewrite Eq. (1) as

$$\dot{f} = f(1 - f)(P_C^s - P_D^s), \quad \dot{z} = -z\bar{P}^s. \tag{10}$$

Dividing the right-hand side of Eq. (10) by $1 - z^{n-1}$, which is positive in
$\Delta \setminus \{z = 1\}$, corresponds to a change in velocity and does not affect the
orbits in $\Delta$ [52]. Using Eqs. (3)-(6), this transforms Eq. (10) into the following.
Under penalties, Eq. (10) becomes

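$$\begin{aligned} \dot{f} &= f(1-f)\left[ -c + \delta + \delta f H(f,z) \right], \\ \dot{z} &= z(1-z)\left[ \sigma + \delta - ((r-1)c + \delta) f + \delta f(1-f) H(f,z) \right], \end{aligned} \tag{11}$$

whereas under rewards, it becomes

$$\begin{aligned} \dot{f} &= f(1-f)\left[ -c + \delta + \delta (1-f) H(1-f,z) \right], \\ \dot{z} &= z(1-z)\left[ \sigma - ((r-1)c + \delta) f - \delta f(1-f) H(1-f,z) \right], \end{aligned} \tag{12}$$

where

$$H(f,z) = \frac{1 - [f + (1-f)z]^{n-1}}{(1-f)(1 - z^{n-1})} = \frac{1 + [f + (1-f)z] + \cdots + [f + (1-f)z]^{n-2}}{1 + z + \cdots + z^{n-2}}. \tag{13}$$

Note that $H(f,0) = \sum_{i=0}^{n-2} f^i$ and $H(f,1) = 1$.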
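The function $H$ is straightforward to evaluate numerically. The sketch below uses the ratio-of-sums form of Eq. (13), which avoids the 0/0 that the first form produces at $f = 1$ or $z = 1$, and checks the two limiting values stated above. The function name `H` mirrors the text; the sample arguments are arbitrary.

```python
def H(f, z, n):
    """Evaluate H(f, z) via the ratio-of-sums form of Eq. (13)."""
    w = f + (1.0 - f) * z
    numerator = sum(w ** i for i in range(n - 1))
    denominator = sum(z ** i for i in range(n - 1))
    return numerator / denominator


# Limiting values for n = 5 and f = 0.4.
h_at_zero = H(0.4, 0.0, 5)  # equals 1 + f + f^2 + f^3
h_at_one = H(0.4, 1.0, 5)   # equals 1
```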

At an interior equilibrium $Q = (f_Q, z_Q)$, the three different strategies must have equal
payoffs, which, in our model, means that they all must equal 0. The conditions
$P_C^p = P_C^o = 0$ under penalties and $P_D^r = P_D^o = 0$ under rewards imply that $f_Q$
is given by

$$f_Q^{(p)} = \frac{c + \sigma}{rc} \ \text{under penalties and} \ f_Q^{(r)} = \frac{\sigma}{rc} \ \text{under rewards,} \tag{14}$$

respectively. Thus, if it exists, the interior equilibrium Q must be located on the line
given by $f = f_Q$. From Eqs. (11) and (12), Q must satisfy

$$H(f,z) = \frac{c - \delta}{\delta f} \ \text{under penalties and} \ H(1-f,z) = \frac{c - \delta}{\delta (1-f)} \ \text{under rewards.} \tag{15}$$

In the specific case when $n = 2$, by solving Eqs. (14) and (15) with $H(f,z) = 1$, we can
see that the dynamic has an interior equilibrium only when
$\delta = rc^2/((r + 1)c + \sigma)$ under penalties or $\delta = rc^2/(2rc - \sigma)$ under
rewards. At this value of $\delta$, the aforementioned line consists of a continuum of
equilibria and connects R and N (Fig. 4). This is a degenerate case of the interior
equilibrium, but in Sasaki et al. [22], this case was not clearly distinguished from the
general form described below.

We next show that $z_Q$ is uniquely determined in the general case for $n > 2$, that is,
that each equation in Eq. (15) has at most one solution with respect to $z$. Because $f_Q$
is independent of $z_Q$, it is sufficient to show that $H(f,z)$ is strictly monotonic in
$z$ for every $z \in (0, 1)$. We first consider penalties. A straightforward computation
yields

$$\begin{aligned} \frac{\partial}{\partial z} H(f,z) &= \frac{n-1}{(1-f)(1 - z^{n-1})^2} \left[ z^{n-2} - (f + (1-f)z)^{n-2} \left( (1-f) + f z^{n-2} \right) \right] \\ &= \frac{(n-1) z^{n-2}}{(1-f)(1 - z^{n-1})^2} \left[ 1 - \left( \frac{(f + (1-f)z)((1-f) + fz)}{z} \right)^{n-2} \frac{(1-f) + f z^{n-2}}{((1-f) + fz)^{n-2}} \right]. \end{aligned} \tag{16}$$



We note that

$$\frac{(f + (1-f)z)((1-f) + fz)}{z} = 1 + f(1-f)\left( z - 2 + \frac{1}{z} \right) = 1 + f(1-f)\frac{(1-z)^2}{z} > 1, \tag{17}$$

and

$$\frac{(1-f) + f z^{n-2}}{((1-f) + fz)^{n-2}} \ge 1. \tag{18}$$



This inequality obviously holds for $n = 2$. By induction for every larger $n$, if it holds
for $n$, it must hold for $n + 1$ because

$$\frac{(1-f) + f z^{n+1}}{((1-f) + fz)^{n+1}} - \frac{(1-f) + f z^{n}}{((1-f) + fz)^{n}} = \frac{f(1-f)(1-z)(1 - z^{n})}{((1-f) + fz)^{n+1}} > 0. \tag{19}$$



Consequently, the square bracketed term in the last line of Eq. (16) is negative. Thus,
$\partial H(f,z)/\partial z < 0$ for every $z \in (0, 1)$. Under rewards, the same argument
applies to $H(1-f,z)$. This concludes our proof of the uniqueness of Q.
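Because $H$ is strictly decreasing in $z$ for $n > 2$, the position of Q can be found by simple bisection on Eq. (15). The sketch below does this for either incentive scheme; the function names and the `scheme` flag are hypothetical, and the routine returns `None` whenever Eq. (15) has no solution in $(0, 1)$, which includes the degenerate case $n = 2$.

```python
def H(f, z, n):
    """Ratio-of-sums form of Eq. (13)."""
    w = f + (1.0 - f) * z
    return sum(w ** i for i in range(n - 1)) / sum(z ** i for i in range(n - 1))


def interior_equilibrium_q(n, r, c, sigma, delta, scheme, tol=1e-12):
    """Return (f_Q, z_Q) solving Eqs. (14) and (15), or None if Q does not exist."""
    f_q = (c + sigma) / (r * c) if scheme == "p" else sigma / (r * c)
    b = f_q if scheme == "p" else 1.0 - f_q  # first argument of H in Eq. (15)
    target = (c - delta) / (delta * b)
    # H(b, .) decreases from H(b, 0) to H(b, 1) = 1, so a root needs target in between.
    if not (H(b, 1.0, n) < target < H(b, 0.0, n)):
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H(b, mid, n) > target:
            lo = mid  # H still too large: the root lies at larger z
        else:
            hi = mid
    return f_q, 0.5 * (lo + hi)
```

For example, with n = 5, r = 3, c = 1, an entrance fee of 0.5, and a per capita penalty of 0.55 (the values of Fig. 2d), the routine places Q on the line $f = 0.5$.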

For $n > 2$, as $\delta$ increases, Q splits off from R (with $x_R = f_Q$), moves across
the state space along the line given by Eq. (14), and finally exits this space through N.
The function $H$ decreases with increasing $z$, and the right-hand side of Eq. (15)
decreases with increasing $\delta$, which implies that $z_Q$ increases with $\delta$. By
substituting Eq. (13) into Eq. (15), we find that the threshold values of $\delta$ for Q's
entrance into ($z = 0$) and exit from ($z = 1$) the state space are respectively given by

$$\delta_s = \frac{c}{1 + B + \cdots + B^{n-1}} \quad \text{and} \quad \delta^s = \frac{c}{1 + B}, \tag{20}$$

where $B = f_Q^{(p)}$ (and $s = p$) under penalties, and $B = 1 - f_Q^{(r)}$ (and $s = r$)
under rewards. We note that $\delta_- < \delta_s \le \delta^s < \delta_+$, where
$\delta_s = \delta^s$ holds only for $n = 2$.
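The thresholds in Eq. (20) are easy to evaluate for concrete parameters. The following sketch computes the entrance and exit values of the per capita incentive for Q under either scheme; the function name and the `scheme` flag are illustrative assumptions.

```python
def q_thresholds(n, r, c, sigma, scheme):
    """Return (entrance, exit) thresholds of delta for Q, following Eq. (20)."""
    if scheme == "p":
        b = (c + sigma) / (r * c)      # B = f_Q under penalties
    else:
        b = 1.0 - sigma / (r * c)      # B = 1 - f_Q under rewards
    entrance = c / sum(b ** i for i in range(n))
    exit_ = c / (1.0 + b)
    return entrance, exit_
```

With the figure parameters (n = 5, r = 3, c = 1, entrance fee 0.5), this gives roughly 0.516 and 0.667 under penalties and roughly 0.279 and 0.545 under rewards, which is consistent with the values of the incentive chosen for the panels of Figs. 2 and 3.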



3.3 The saddle point Q 



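We next prove that for $n > 2$, Q is a saddle point. We first consider penalties using
Eq. (11). Because the square brackets in Eq. (11) vanish at Q, the Jacobian at Q is given by

$$J_Q = \left. \begin{pmatrix} \delta f(1-f) \left( H + f \dfrac{\partial H}{\partial f} \right) & \delta f^2 (1-f) \dfrac{\partial H}{\partial z} \\ -z(1-z) \left[ A - \delta \left( (1-2f) H + f(1-f) \dfrac{\partial H}{\partial f} \right) \right] & \delta f(1-f) z(1-z) \dfrac{\partial H}{\partial z} \end{pmatrix} \right|_{Q}, \tag{21}$$

where $H = H(f,z)$ and $A = (r - 1)c + \delta$. Using $\partial H(f,z)/\partial z < 0$,
$H > 0$, and $A > 0$, we obtain

$$\det J_Q = \delta f^2 (1-f) z(1-z) \left[ A + \delta f H(f,z) \right] \frac{\partial H}{\partial z} < 0. \tag{22}$$

Therefore, Q is a saddle point.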
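The saddle property under penalties can be illustrated numerically. The sketch below locates Q by bisection and approximates the Jacobian of Eq. (11) with central differences; a negative determinant indicates a saddle. This is a numerical illustration under assumed parameter values, not a substitute for the proof, and the helper names are hypothetical. The routine assumes that the per capita penalty lies strictly between the entrance and exit thresholds of Q.

```python
def H(f, z, n):
    """Ratio-of-sums form of Eq. (13)."""
    w = f + (1.0 - f) * z
    return sum(w ** i for i in range(n - 1)) / sum(z ** i for i in range(n - 1))


def field_penalties(f, z, n, r, c, sigma, delta):
    """Right-hand side of Eq. (11)."""
    h = H(f, z, n)
    df = f * (1.0 - f) * (-c + delta + delta * f * h)
    dz = z * (1.0 - z) * (sigma + delta - ((r - 1.0) * c + delta) * f
                          + delta * f * (1.0 - f) * h)
    return df, dz


def det_jacobian_at_q(n, r, c, sigma, delta, step=1e-6):
    """Return (z_Q, det J_Q) using bisection and central differences."""
    f_q = (c + sigma) / (r * c)
    target = (c - delta) / (delta * f_q)
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; H is decreasing in z
        mid = 0.5 * (lo + hi)
        if H(f_q, mid, n) > target:
            lo = mid
        else:
            hi = mid
    z_q = 0.5 * (lo + hi)
    args = (n, r, c, sigma, delta)
    f_plus = field_penalties(f_q + step, z_q, *args)
    f_minus = field_penalties(f_q - step, z_q, *args)
    z_plus = field_penalties(f_q, z_q + step, *args)
    z_minus = field_penalties(f_q, z_q - step, *args)
    j11 = (f_plus[0] - f_minus[0]) / (2.0 * step)
    j21 = (f_plus[1] - f_minus[1]) / (2.0 * step)
    j12 = (z_plus[0] - z_minus[0]) / (2.0 * step)
    j22 = (z_plus[1] - z_minus[1]) / (2.0 * step)
    return z_q, j11 * j22 - j12 * j21
```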

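We next consider rewards using Eq. (12). Similarly, we find that the Jacobian at Q is
given by

$$J_Q = \left. \begin{pmatrix} \delta f(1-f) \left( -H + (1-f) \dfrac{\partial H}{\partial f} \right) & \delta f(1-f)^2 \dfrac{\partial H}{\partial z} \\ -z(1-z) \left[ A + \delta \left( (1-2f) H + f(1-f) \dfrac{\partial H}{\partial f} \right) \right] & -\delta f(1-f) z(1-z) \dfrac{\partial H}{\partial z} \end{pmatrix} \right|_{Q}, \tag{23}$$

where $H = H(1-f,z)$ and $A$ is as in Eq. (21). Using $\partial H(1-f,z)/\partial z < 0$,
$H > 0$, and $A > 0$, it follows again that $\det J_Q < 0$. Therefore, Q is a saddle point.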



3.4 Classification of global dynamics 

Here, we analyze in detail the global dynamics using Eqs. (11) and (12), which are well
defined on the entire unit square $U = \{(f,z) : 0 \le f \le 1, 0 \le z \le 1\}$. The
induced mapping, cont : $U \to \Delta$, contracts the edge $z = 1$ onto the vertex N. Note
that C = (1,0) and D = (0,0) as well as both ends of the edge $z = 1$, $N_0$ = (0,1) and
$N_1$ = (1,1), are hyperbolic equilibria, except when each undergoes bifurcation (as shown
later). We note that the dynamic on the $N_1N_0$-edge is unidirectional to $N_0$ without
incentives.

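First, we examine penalties. From Eq. (11), the Jacobians at C and $N_1$ are respectively
given by

$$J_C = \begin{pmatrix} c - n\delta & 0 \\ 0 & -[(r-1)c - \sigma] \end{pmatrix} \quad \text{and} \quad J_{N_1} = \begin{pmatrix} c - 2\delta & 0 \\ 0 & (r-1)c - \sigma \end{pmatrix}. \tag{24}$$

From our assumption that $(r - 1)c > \sigma$, it follows that if $\delta < c/n$, then
$\det J_C < 0$, and thus, C is a saddle point; otherwise, $\det J_C > 0$ and
$\operatorname{tr} J_C < 0$, and thus, C is a sink. Regarding $N_1$, if $\delta < c/2$,
$N_1$ is a source ($\det J_{N_1} > 0$ and $\operatorname{tr} J_{N_1} > 0$); otherwise,
$N_1$ is a saddle point ($\det J_{N_1} < 0$). Next, the Jacobians at D and $N_0$ are
respectively given by

$$J_D = \begin{pmatrix} -(c - \delta) & 0 \\ 0 & \sigma + \delta \end{pmatrix} \quad \text{and} \quad J_{N_0} = \begin{pmatrix} -(c - \delta) & 0 \\ 0 & -(\sigma + \delta) \end{pmatrix}. \tag{25}$$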



10 



Tatsuya Sasaki 



If $\delta < c$, D is a saddle point ($\det J_D < 0$), and $N_0$ is a sink
($\det J_{N_0} > 0$ and $\operatorname{tr} J_{N_0} < 0$); otherwise, D is a source
($\det J_D > 0$ and $\operatorname{tr} J_D > 0$), and $N_0$ is a saddle point
($\det J_{N_0} < 0$).

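We also analyze the stability of R. As $\delta$ increases from $c/n$ to $c$, the boundary
repellor R = $(x_R, 0)$ enters the CD-edge at C and then moves to D. The Jacobian at R is
given by

$$J_R = \begin{pmatrix} \delta x_R (1 - x_R) \left. \dfrac{\partial}{\partial f} \left( f H(f,z) \right) \right|_{R} & \ast \\ 0 & -rc x_R + (c + \sigma) \end{pmatrix}, \tag{26}$$

where $\ast$ marks an entry that does not affect the eigenvalues. Its upper diagonal
component is positive because $\partial H(f,z)/\partial f \ge 0$ and $H > 0$, whereas the
lower component vanishes at $x_R = f_Q^{(p)} = (c + \sigma)/(rc)$. Therefore, if
$f_Q^{(p)} < x_R < 1$, R is a saddle point ($\det J_R < 0$) and is stable with respect to
$z$; otherwise, if $0 < x_R < f_Q^{(p)}$, R is a source ($\det J_R > 0$ and
$\operatorname{tr} J_R > 0$).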

In addition, a new boundary equilibrium S = $(x_S, 1)$ can appear along the $N_1N_0$-edge.
Solving $\dot{f}(x_S, 1) = 0$ in Eq. (11) yields $x_S = (c - \delta)/\delta$; thus, S is
unique. S is a repellor along the edge (as is R). As $\delta$ increases, S enters the edge
at $N_1$ (for $\delta = c/2$) and exits it at $N_0$ (for $\delta = c$). The Jacobian at S is
given by

$$J_S = \begin{pmatrix} \delta x_S (1 - x_S) \left. \dfrac{\partial}{\partial f} \left( f H(f,z) \right) \right|_{S} & \ast \\ 0 & \delta x_S^2 + (r-1)c x_S - \sigma - \delta \end{pmatrix}, \tag{27}$$

where $\ast$ marks an entry that does not affect the eigenvalues. Again, its upper diagonal
component is positive. Using $x_S = (c - \delta)/\delta$, we find that the sign of the lower
component changes once, from positive to negative, as $\delta$ increases from $c/2$ to $c$.
Therefore, S is initially a source ($\det J_S > 0$ and $\operatorname{tr} J_S > 0$) but then
turns into a saddle point ($\det J_S < 0$), which is stable with respect to $z$.
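The sign change of the lower component at S can be checked directly from the edge dynamics of Eq. (11) at $z = 1$, where $H = 1$. The sketch below returns the position of S and the value of that component under penalties; the function name is hypothetical. The component should vanish exactly when the per capita penalty equals $c/(1 + f_Q^{(p)})$, the value at which Q leaves the state space through S.

```python
def boundary_equilibrium_s(r, c, sigma, delta):
    """Return (x_S, lower diagonal entry of J_S) under penalties, or None if S is absent."""
    if not (c / 2.0 < delta < c):
        return None
    x_s = (c - delta) / delta
    # Derivative of the z-equation in Eq. (11) with respect to z at z = 1 (H = 1 there).
    lower = delta * x_s ** 2 + (r - 1.0) * c * x_s - sigma - delta
    return x_s, lower
```

For r = 3, c = 1, and an entrance fee of 0.5, the component is positive at a per capita penalty of 0.51 (S is a source, as in Fig. 2c) and negative at 0.7 (S is a saddle, as in Fig. 2e).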

We give a full classification of the global dynamics under penalties, as follows.

1. For $0 \le \delta < \delta_-$ (Fig. 2a), C and D are saddle points, $N_1$ is a source,
and $N_0$ is a sink. There is no other equilibrium, and $\dot{f} < 0$ holds in the interior
state space. All interior orbits originate from $N_1$ and converge to $N_0$. $N_0$ is
globally stable. After applying the contraction map, we find that the interior of $\Delta$
is filled with homoclinic orbits originating from and converging to N.

2. As $\delta$ crosses $\delta_-$ (Fig. 2b), C becomes a sink, and the equilibrium R enters
the CD-edge at C. R is unstable along that edge but is stable with respect to $z$.
Therefore, there is an orbit originating from $N_1$ and converging to R that separates the
basins of attraction of C and $N_0$. All of the orbits in the basin of $N_0$ have their
$\alpha$-limits at $N_1$, as before. Hence, the corresponding region in $\Delta$ is filled
with homoclinic orbits and is surrounded by a heteroclinic cycle N $\to$ R $\to$ D $\to$ N.
However, if the population is in the vicinity of N, small and rare random perturbations will
eventually send the population into the basin of attraction of C (as is the case for
$c/2 < \delta$).
Fig. 2 Optional public good games with institutional punishment. The triangles represent
the state space $\Delta = \{(x,y,z) : x,y,z \ge 0, x + y + z = 1\}$. Its vertices C, D, and
N correspond to the three homogeneous states of cooperators ($x = 1$), defectors ($y = 1$),
and non-participants ($z = 1$), respectively. The unit squares represent an extended state
space $U = \{(f,z) : 0 \le f \le 1, 0 \le z \le 1\}$ such that $\Delta$ is its image
according to the mapping $x = f(1 - z)$, $y = (1 - f)(1 - z)$, which is injective except for
$z = 1$. The edge $z = 1$ is contracted to N. The vertices of $U$ are denoted by C = (1,0),
D = (0,0), $N_1$ = (1,1), and $N_0$ = (0,1). The stream plot is based on Eq. (11). Dotted
and dashed curves in $U$ denote where $\dot{f}$ and $\dot{z}$ vanish, respectively.
a Without incentives, the interior of $U$ is filled with orbits originating from $N_1$ and
then converging to $N_0$, which correspond to homoclinic cycles that fully cover the
interior of $\Delta$. b As $\delta$ increases, the equilibrium R (a saddle point) first
enters the CD-edge at C, which then becomes a sink. c When $\delta$ crosses $c/2$, the
equilibrium S (a source) enters the $N_1N_0$-edge at $N_1$, which then becomes a saddle
point. d When $\delta$ crosses $\delta_p$, the saddle point Q enters the interior of
$\Delta$ through R, which then becomes a source. Q traverses $U$ along a horizontal line.
e When $\delta$ crosses $\delta^p$, Q exits $\Delta$ through S, which then becomes a saddle.
For larger values of $\delta$, there is no such interior orbit that originates from the
$N_1N_0$-edge and converges to it, and thus, $\Delta$ has no homoclinic cycle. When $\delta$
crosses $\delta_+$, R and S exit $\Delta$ through D, which becomes a source, and $N_0$,
which becomes a saddle. f For $\delta > \delta_+$, the interiors of $U$ and $\Delta$ are
filled with orbits originating from D and converging to C. Parameters are the same as in
Fig. 1: $n = 5$, $r = 3$, $c = 1$, $\sigma = 0.5$, and $\delta$ = 0 (a), 0.25 (b), 0.51 (c),
0.55 (d), 0.7 (e), or 1.2 (f)
3. As $\delta$ crosses $c/2$ (Fig. 2c), $N_1$ becomes a saddle point, and a new equilibrium
S enters the $N_1N_0$-edge at $N_1$. S is a source. As $\delta$ increases, S moves toward
$N_0$. If $c/2 < \delta_p$ holds, then for $c/2 < \delta < \delta_p$, there is still an
orbit originating from S and converging to R that separates the state space into the basins
of attraction of C and $N_0$. All of the orbits in the basin of $N_0$ have their
$\alpha$-limits at $N_1$, as before. In $\Delta$, the separatrix NR and the NC-edge now
intersect transversally at N, and the entrance of a minority of participants (including
cooperators and defectors) into the greater population of non-participants may be
successful.

4. As $\delta$ crosses $\delta_p$ (Fig. 2d), the saddle point Q enters the interior of
$\Delta$ through R, which becomes a source. Based on the uniqueness of Q and the
Poincaré-Bendixson theorem ([52], Appendix A), we can see that there is no homoclinic orbit
originating from and converging to Q, and the unstable manifold of Q must consist of an
orbit converging to C and an orbit converging to $N_0$; the stable manifold of Q must
consist of an orbit originating from D and an orbit originating from S (or, in the case that
$\delta_p < c/2$, from $N_1$ for $\delta_p < \delta < c/2$). The stable manifold separates
the basins of attraction of C and $N_0$; the unstable manifold separates the basin for $N_0$
into two regions. One of them is filled with orbits originating from S (or from $N_1$ under
the above conditions) and converging to $N_0$. For $\Delta$, this means that the
corresponding region is filled with homoclinic orbits and is surrounded by a heteroclinic
cycle N $\to$ Q $\to$ N (Fig. 2d). As $\delta$ further increases, Q moves across $U$, from
the CD-edge to the $N_1N_0$-edge along the line $f = f_Q^{(p)}$. For $n = 2$, R and S
undergo bifurcation simultaneously, and the linear continuum of interior equilibria, which
connects R and S, appears only at the bifurcation point (Fig. 4a).

5. As $\delta$ crosses $\delta^p$ (Fig. 2e), Q exits the state space through S, which then
becomes saturated. For larger values of $\delta$, there is no longer an interior
equilibrium. S is a saddle point, which is connected with the source R by an orbit leading
from R to S.

6. Finally, as $\delta$ crosses $\delta_+$ (Fig. 2f), R and S simultaneously exit $U$,
through D and $N_0$, respectively. For $\delta_+ < \delta$, $N_1$ and $N_0$ are saddle
points, D is a source, and C is a sink. $\dot{f} > 0$ holds throughout the state space. All
of the interior orbits originate from D and converge to C. Hence, C is globally stable.
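The two extreme cases of this classification can be reproduced with a few lines of Python. The sketch below integrates Eq. (11) with the explicit Euler method from one interior initial condition; the step size, the horizon, and the initial state are arbitrary illustrative choices, and the function names are hypothetical. For a per capita penalty above $c$, the orbit should approach C = (1,0); without incentives, it should approach $N_0$ = (0,1).

```python
def H(f, z, n):
    """Ratio-of-sums form of Eq. (13); well defined on the whole unit square."""
    w = f + (1.0 - f) * z
    return sum(w ** i for i in range(n - 1)) / sum(z ** i for i in range(n - 1))


def simulate_penalties(f, z, n, r, c, sigma, delta, dt=0.01, steps=10000):
    """Integrate Eq. (11) with explicit Euler steps and return the final (f, z)."""
    for _ in range(steps):
        h = H(f, z, n)
        df = f * (1.0 - f) * (-c + delta + delta * f * h)
        dz = z * (1.0 - z) * (sigma + delta - ((r - 1.0) * c + delta) * f
                              + delta * f * (1.0 - f) * h)
        f, z = f + dt * df, z + dt * dz
    return f, z


# Strong penalties (Fig. 2f) versus no incentives (Fig. 2a), same initial state.
end_strong = simulate_penalties(0.3, 0.5, 5, 3.0, 1.0, 0.5, 1.2)
end_none = simulate_penalties(0.3, 0.5, 5, 3.0, 1.0, 0.5, 0.0)
```

A fixed-step Euler scheme is used only to keep the example short; an adaptive integrator would be preferable for drawing phase portraits near the boundary equilibria.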

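Let us now turn to rewards. From Eq. (12), the Jacobians at D and $N_0$ are

$$J_D = \begin{pmatrix} -(c - n\delta) & 0 \\ 0 & \sigma \end{pmatrix} \quad \text{and} \quad J_{N_0} = \begin{pmatrix} -(c - 2\delta) & 0 \\ 0 & -\sigma \end{pmatrix}. \tag{28}$$

If $\delta < c/n$, D is a saddle point ($\det J_D < 0$); otherwise, D is a source
($\det J_D > 0$ and $\operatorname{tr} J_D > 0$). Regarding $N_0$, if $\delta < c/2$, $N_0$
is a sink ($\det J_{N_0} > 0$ and $\operatorname{tr} J_{N_0} < 0$); otherwise, $N_0$ is a
saddle point ($\det J_{N_0} < 0$). Meanwhile, the Jacobians at C and $N_1$ are

$$J_C = \begin{pmatrix} c - \delta & 0 \\ 0 & -[(r-1)c - \sigma + \delta] \end{pmatrix} \quad \text{and} \quad J_{N_1} = \begin{pmatrix} c - \delta & 0 \\ 0 & (r-1)c - \sigma + \delta \end{pmatrix}. \tag{29}$$

From $(r - 1)c > \sigma - \delta$, it follows that if $\delta < c$, C is a saddle point
($\det J_C < 0$), and $N_1$ is a source ($\det J_{N_1} > 0$ and
$\operatorname{tr} J_{N_1} > 0$); otherwise, C is a sink ($\det J_C > 0$ and
$\operatorname{tr} J_C < 0$), and $N_1$ is a saddle point ($\det J_{N_1} < 0$).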
We also analyze the stability of R. As $\delta$ increases from $c/n$ to $c$, the boundary
attractor R enters the CD-edge at D and then moves toward C. The Jacobian at R is given by

$$J_R = \begin{pmatrix} \delta x_R (1 - x_R) \left. \dfrac{\partial}{\partial f} \left( (1-f) H(1-f,z) \right) \right|_{R} & \ast \\ 0 & -rc x_R + \sigma \end{pmatrix}, \tag{30}$$

where $\ast$ marks an entry that does not affect the eigenvalues. Its upper diagonal
component is negative because $\partial H(1-f,z)/\partial f \le 0$ and $H > 0$, and the
lower component vanishes at $x_R = f_Q^{(r)} = \sigma/(rc)$. Therefore, if
$0 < x_R < f_Q^{(r)}$, R is a saddle point ($\det J_R < 0$) and unstable with respect to
$z$; otherwise, if $f_Q^{(r)} < x_R < 1$, R is a sink ($\det J_R > 0$ and
$\operatorname{tr} J_R < 0$).
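The classification of R under rewards can be reproduced numerically. The sketch below locates R on the CD-edge by bisection on Eq. (8) and classifies it by the sign of the lower diagonal component, $-rc x_R + \sigma$. The function name and the returned labels are illustrative assumptions.

```python
def classify_r_under_rewards(n, r, c, sigma, delta, tol=1e-12):
    """Return (x_R, label) for the CD-edge equilibrium under rewards, or None."""
    if not (c / n < delta < c):
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Eq. (8): payoff advantage of cooperators, strictly decreasing in x.
        gap = -c + delta * sum((1.0 - mid) ** i for i in range(n))
        if gap > 0:
            lo = mid  # cooperators still do better: R lies further toward C
        else:
            hi = mid
    x_r = 0.5 * (lo + hi)
    lower = -r * c * x_r + sigma  # growth rate of non-participants near R
    return x_r, ("saddle" if lower > 0 else "sink")
```

With n = 5, r = 3, c = 1, and an entrance fee of 0.5, R should be a saddle for a per capita reward of 0.25 and a sink for 0.35, which matches panels b and c of Fig. 3.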

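Similarly, a boundary equilibrium S can appear along the $N_1N_0$-edge. Solving
$\dot{f}(x_S, 1) = 0$ in Eq. (12) yields $x_S = 1 - (c - \delta)/\delta$, and thus, S is
unique. S is an attractor along the edge (as is R). As $\delta$ increases, S enters the edge
at $N_0$ (for $\delta = c/2$) and exits at $N_1$ (for $\delta = c$). The Jacobian at S is

$$J_S = \begin{pmatrix} \delta x_S (1 - x_S) \left. \dfrac{\partial}{\partial f} \left( (1-f) H(1-f,z) \right) \right|_{S} & \ast \\ 0 & -\left[ \delta x_S^2 - ((r-1)c + 2\delta) x_S + \sigma \right] \end{pmatrix}, \tag{31}$$

where $\ast$ marks an entry that does not affect the eigenvalues. Again, its upper diagonal
component is negative. Using $x_S = 1 - (c - \delta)/\delta$, we find that the sign of the
lower component changes once, from negative to positive, as $\delta$ increases from $c/2$
to $c$. Therefore, S is initially a sink ($\det J_S > 0$ and $\operatorname{tr} J_S < 0$)
and then becomes a saddle point ($\det J_S < 0$), which is unstable with respect to $z$.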
A full classification of the global dynamics under rewards is as follows.

1. For $0 \le \delta < \delta_-$ (Fig. 3a), C and D are again saddle points, $N_1$ is a
source, and $N_0$ is a sink. $\dot{f} < 0$ holds in the interior state space, and all of the
interior orbits originate from $N_1$ and converge to $N_0$. The interior of $\Delta$ is
filled with homoclinic orbits originating from and converging to N.

2. As $\delta$ crosses $\delta_-$ (Fig. 3b), D turns into a source, and the saddle point R
enters the CD-edge through D. There exists an orbit originating from R and converging to
$N_0$. In contrast to the case with penalties, $N_0$ remains a global attractor. A region
separated by the orbit R$N_0$ encloses orbits with $N_1$ as their $\alpha$-limit.
Therefore, in $\Delta$, the corresponding region is filled with homoclinic orbits that are
surrounded by a heteroclinic cycle N $\to$ C $\to$ R $\to$ N.

3. As $\delta$ crosses $c/2$, $N_0$ becomes a saddle point, and the equilibrium S enters
the $N_1N_0$-edge at $N_0$. S is a sink (and thus, a global attractor). As $\delta$
increases, S moves to $N_1$. If $c/2 < \delta_r$ holds, then for $c/2 < \delta < \delta_r$,
there exists an orbit originating from R and converging to S, which separates the interior
state space into two regions. One of these regions consists of orbits originating from
$N_1$, corresponding in $\Delta$ to a region filled with homoclinic orbits. The other region
consists of orbits originating from D. In $\Delta$, the separatrix RN and the NC-edge
intersect transversally at N.
Fig. 3 Optional public good games with institutional rewards. Notations are as in Fig. 2,
and the stream plot is based on Eq. (12). a Without incentives, this figure is the same as
Fig. 2a. b As $\delta$ increases, the equilibrium R (a saddle point) first enters the
CD-edge at D, which then becomes a source. c When $\delta$ crosses $\delta_r$, the saddle
point Q enters the interior of $\Delta$ through R, which then becomes a sink. Q traverses
$U$ along a horizontal line. d When $\delta$ crosses $c/2$, the rest point S (a sink) enters
the $N_1N_0$-edge at $N_0$, which then becomes a saddle point. e When $\delta$ crosses
$\delta^r$, Q exits $U$ through S, which then becomes a saddle point. For larger values of
$\delta$, there is no such interior orbit that originates from the $N_1N_0$-edge and
converges to it and, thus, $\Delta$ has no homoclinic cycle. When $\delta$ crosses
$\delta_+$, R and S exit $\Delta$ through C, which becomes a sink, and $N_1$, which becomes
a saddle. f For $\delta > \delta_+$, the interiors of $U$ and $\Delta$ are filled with
orbits originating from D and then converging to C, as in the case with institutional
punishment. The parameters are the same as in Figs. 1 and 2, except $\delta$ = 0 (a),
0.25 (b), 0.35 (c), 0.52 (d), 0.7 (e), or 1.2 (f)
4. As $\delta$ crosses $\delta_r$ (Fig. 3d), the saddle point Q enters the interior state space through
R, which then becomes a sink. There is no homoclinic loop for Q, as before, and
now, we find that the stable manifold of Q must consist of two orbits originating
from D and $N_1$. The unstable manifold of Q must consist of an orbit converging
to R and another converging to S or, in the case that $\delta_r < c/2$, converging to $N_0$
for $\delta_r < \delta < c/2$ (Fig. 3c). The stable manifold separates the basins of attraction
of R and S (or $N_0$ under the above conditions); the unstable manifold separates
the basin for S (or $N_0$) into two regions. One of these regions is filled with orbits
issuing from $N_1$ and converging to S (or $N_0$). The corresponding region in $\Delta$ is
filled with homoclinic orbits and is surrounded by a heteroclinic cycle $N \to Q \to N$
(Figs. 3c and 3d). As $\delta$ continues to increase, Q moves through U, from the
CD-edge to the $N_1N_0$-edge, along a horizontal line (constant f). For n = 2, R and S undergo bifurcation
simultaneously, and the continuum of interior equilibria, which connects R and 
S, appears only at the bifurcation point (Fig. 4b). 

5. As $\delta$ crosses $\delta'$ (Fig. 3e), Q exits the state space through S, which then becomes
a saddle point. For larger values of $\delta$, there is no longer an interior equilibrium.
S is connected with the sink R by an orbit from S to R. All of the interior orbits
converge to R.

6. Finally, as $\delta$ crosses $\delta_+$ (Fig. 3f), R and S simultaneously exit U through C and
$N_1$, respectively. Just as in the case with punishment, for $\delta_+ < \delta$, $N_1$ and $N_0$ are
saddle points, and D is a source. Finally, C is a sink. $\dot f > 0$ holds throughout the
state space. All of the interior orbits originate from D and then converge to C.
Hence, C is globally stable.
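The six cases above can be summarized as a lookup from the per capita incentive to the set of attracting equilibria. The Python sketch below is only an illustration under stated assumptions. The function name and the arguments `delta_q_in` (the threshold at which Q enters through R) and `delta_q_out` (the threshold at which Q exits through S) are hypothetical, and these two thresholds are passed in because their formulas are not given in this section. The lower threshold is taken as c/n and the upper threshold as c, the latter being an assumption inferred from the range over which S exists. Values exactly at a threshold are bifurcation points and are lumped with the lower regime here.

```python
def reward_attractors(delta, c, n, delta_q_in, delta_q_out):
    """Return the set of sinks under institutional rewards, following cases 1-6.

    Assumed ordering (a condition of this sketch, not a result of the paper):
    c/n < delta_q_in < delta_q_out < c, and c/2 < delta_q_out.
    """
    delta_minus = c / n  # lower threshold, c/n
    delta_plus = c       # ASSUMPTION: upper threshold equals c
    ordered = delta_minus < delta_q_in < delta_q_out < delta_plus
    if delta < 0 or not ordered or not (c / 2 < delta_q_out):
        raise ValueError("inputs violate the assumptions of this sketch")

    if delta > delta_plus:
        return {"C"}              # case 6: C is globally stable
    if delta > delta_q_out:
        return {"R"}              # case 5: all interior orbits converge to R
    # On the non-participation edge, the sink is N0 until S enters at c/2.
    edge_sink = "S" if delta > c / 2 else "N0"
    if delta > delta_q_in:
        return {"R", edge_sink}   # case 4: two basins separated via Q
    return {edge_sink}            # cases 1-3: a single global attractor


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    for d in (0.0, 0.25, 0.35, 0.52, 0.7, 1.2):
        print(d, sorted(reward_attractors(d, 1.0, 5, 0.3, 0.6)))
```

With these illustrative inputs, the six sample values of the incentive fall one after another into a single edge attractor, two coexisting attractors, the attractor R alone, and finally full cooperation.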



4 Discussion 

We considered a model for the evolution of cooperation through institutional
incentives and analyzed its evolutionary game dynamics in detail. Specifically, based on a
public good game with optional participation, we fully analyzed how opting out
impacts the game dynamics; in particular, opting out can completely relax a coordination
problem associated with punishment for a considerably broader range of parameters
than in cases of compulsory participation.

We start by assuming that there is a state-like institution that takes exclusive
control of individual-level sanctions in the form of penalties and rewards. In our
extended model, nobody is forced to enter a joint enterprise that is protected by
institutional sanctioning; however, whoever is willing to enter is charged at
the entrance. Further, if one proves unable or unwilling to pay, the sanctioning
institution can ban that person from participation in the game. Indeed, joint ventures in
real life are mostly protected by enforceable contracts in which members can freely
participate but are bound by a higher authority. For example, anyone can opt not to
take a wedding vow (which involves a donation to the temple or church), but once it is
taken, it is the strongest among enforceable contracts. As far as we know,
such higher authorities always demand penalties if contracts are broken.

Based on our mathematical analysis, we argue that institutional punishment, rather
than institutional rewards, can become a more viable incentivization scheme for
cooperation when combined with optional participation. We show that combining optional
participation with rewards can complicate the game dynamics, especially if there is
an attractor in which all three strategies (cooperation, defection, and
non-participation) are present. This can only marginally improve group welfare for a small range of the per
capita incentive $\delta$, with $\delta_- < \delta < \delta_r$ (Fig. 3b). Within this interval, compulsory
participation can lead to partial cooperation; however, optional participation eliminates
the cooperation and thus drives a population into a state in which all players exit.
Hence, freedom of participation is not a particularly effective way of boosting
cooperation under a rewards scenario.

Fig. 4 Optional public good games with institutional incentives for n = 2. Notations are as in Fig. 2, and
the stream plot is given under (a) penalties based on Eq. (11) and (b) rewards based on Eq. (12). Other
parameters include c = 1, $\sigma$ = 0.5, and r = 3 (a) or 1 (b). For n = 2, the state space may have an interior
equilibrium, which is a linear continuum of equilibria, only at $\delta$ = 2/3. a Under penalties, the fixed
line that connects N (S in U) and R is repelling with respect to f and divides $\Delta$ into basins of attraction of
N ($N_0$ in U) and C. From the vicinity of N, arbitrarily small random perturbations will send the state into
the region of attraction of C. b Under rewards, the fixed line is attracting with respect to f, and thus, the
interior orbits converge to corresponding points on the line

Under penalties, the situation is considerably different. Indeed, as soon as $\delta > \delta_-$ (Fig.
2b), the state in which all players cooperate abruptly turns into a global attractor for
optional participation. When $\delta$ just exceeds $\delta_-$, group welfare attains its maximum,
$(r-1)c - \sigma$. Meanwhile, for compulsory participation, almost all of the (boundary)
state space between cooperation and defection still belongs to the basin of attraction
of the state in which all players defect. Because $\delta_- = c/n$, where n is the group size
and c is the net contribution cost (a constant), the larger n is, the smaller the minimal
institutional sanctioning cost $\delta$ needed to establish full cooperation.
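As a small numerical illustration of this scaling, the Python sketch below evaluates the threshold c/n for several group sizes, together with the maximum group welfare. The function names are hypothetical, and the welfare formula (r - 1)c - sigma reads the garbled symbol in the text as the participation fee sigma, which is an assumption. The parameter values are illustrative only.

```python
def delta_minus(c: float, n: int) -> float:
    """Minimal per capita penalty above which full cooperation becomes a
    global attractor under optional participation: c / n."""
    if n < 2:
        raise ValueError("group size n must be at least 2")
    return c / n


def max_group_welfare(r: float, c: float, sigma: float) -> float:
    """Group welfare at full cooperation, (r - 1) * c - sigma.

    ASSUMPTION: sigma denotes the participation fee.
    """
    return (r - 1) * c - sigma


if __name__ == "__main__":
    c, r, sigma = 1.0, 3.0, 0.5  # illustrative values
    for n in (2, 5, 10, 50):
        # The threshold shrinks as the group grows.
        print(n, delta_minus(c, n))
    print(max_group_welfare(r, c, sigma))
```

The threshold falls in inverse proportion to n, whereas the welfare at full cooperation does not depend on n in this expression.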

There are various approaches to equilibrium selection in n-person coordination
games with binary choices [54-56]. One strand of the literature is based on stochastic evolution
models [57-59], in which, typically, a risk-dominant equilibrium [60], which has the
larger basin of attraction, is selected through random fluctuations in the long run. In
contrast, by considering optional participation, our model typically selects the
cooperative equilibrium, which provides the higher group welfare, even if the cooperative
equilibrium has a smaller basin of attraction than the defective equilibrium when
participation is compulsory. In the sense of favoring the efficient equilibrium, our
result is similar to that found in a decentralized partner-changing model proposed by
Oechssler [61], in which players may occasionally change interaction groups.

Under the centralized institutional sanctions considered so far, norm-based
cooperation is less likely to suffer from higher-order freeloaders, which have been
problematic in modeling decentralized peer-to-peer sanctions [2,62]. In addition, it is clear
that sanctioning institutions will stipulate less antisocial punishment targeted at
cooperators [63], which can prevent the evolution of pro-social behaviors ([64,65],
see also [36]). Indeed, punishing cooperators essentially promotes defectors, who will
reduce the number of participants willing to pay for social institutions. Thus, for
self-sustainability, sanctioning institutions should dismiss any antisocial schemes
that may lead to a future reduction in resources for funding the institution.

Thus, our model reduces the space of possible actions to a very
narrow framework of alternative strategies, in exchange for increasing the degree of
the institution's complexity and abstractness. In practice, truly chaotic situations
that offer a very long list of possibilities are infeasible and inconvenient,
as described by Michael Ende in "The Prison of Freedom" [1992]. Participants in
economic experiments can usually make meaningful choices only from a short
and regulated list of options, as we do in real life. Our result indicates
that a third party capable of exclusively controlling incentives and membership can
play a key role in selecting a cooperative equilibrium without ex ante adjustment. The
question of how such a social order can emerge out of a world of chaos is left entirely
open.
Appendix A

First, we prove that a homoclinic loop that originates from and converges to Q does
not exist. Using the Poincaré-Bendixson theorem [52] and the uniqueness of an
interior equilibrium, we show that if it does exist, there must be a point p inside the
loop such that both its $\alpha$- and $\omega$-limit sets include Q. This contradicts the fact that
Q is a saddle point. Indeed, there may be a section that cuts through Q such that the
positive and negative orbits of p cross it infinitely often; however, it is impossible for
a sequence consisting of all the crossing points to originate from and also converge
to the saddle point Q. Hence, there is no homoclinic orbit of Q.

Next, we show that the orbits that form the unstable manifold of Q do not converge
to the same equilibrium (which is, indeed, a sink). If they do, the closed region that is
surrounded by the orbits must include a point q such that its $\omega$-limit set is Q. Using
the Poincaré-Bendixson theorem and the uniqueness of an interior equilibrium, the
$\alpha$-limit set for q must include Q; this is a contradiction. Similarly, we can prove that
the orbits that form the stable manifold of Q do not issue from the same equilibrium.